Constituent Parsing by Classification

نویسندگان

  • Joseph P. Turian
  • I. Dan Melamed
چکیده

We present an approach to constituent parsing, which is driven by classifiers induced to minimize a single regularized objective. It is the first discriminatively-trained constituent parser to surpass the Collins (2003) parser without using a generative model. Our primary contribution is simplifying the human effort required for feature engineering. Our model can incorporate arbitrary features of the input and parse state. Feature selection and feature construction occur automatically, as part of learning. We define a set of fine-grained atomic features, and let the learner induce informative compound features. Our learning approach includes several novel approximations and optimizations which improve the efficiency of discriminative training. We introduce greedy completion, a new agenda-driven search strategy designed to find low-cost solutions given a limit on search effort. The inference evaluation function was learned accurately enough to guide the deterministic parsers to the optimal parse reasonably quickly without pruning, and thus without search errors. Experiments demonstrate the flexibility of our approach, which has also been applied to machine translation (Wellington et al., 2006; Turian et al., 2007).

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Shallow Discourse Parsing Using Constituent Parsing Tree

This paper describes our system in the closed track of the shared task of CoNLL2015. We formulize the discourse parsing work into a series of classification subtasks. The official evaluation shows that the proposed framework can give competitive results and we give a few discussions over latent improvement as well.

متن کامل

Sentiment Composition

Sentiment classification of grammatical constituents can be explained in a quasicompositional way. The classification of a complex constituent is derived via the classification of its component constituents and operations on these that resemble the usual methods of compositional semantic analysis. This claim is illustrated with a description of sentiment propagation, polarity reversal, and pola...

متن کامل

Combine Constituent and Dependency Parsing via Reranking

This paper presents a reranking approach to combining constituent and dependency parsing, aimed at improving parsing performance on both sides. Most previous combination methods rely on complicated joint decoding to integrate graphand transition-based dependency models. Instead, our approach makes use of a high-performance probabilistic context free grammar (PCFG) model to output k-best candida...

متن کامل

Why is German Dependency Parsing More Reliable than Constituent Parsing?

In recent years, research in parsing has extended in several new directions. One of these directions is concerned with parsing languages other than English. Treebanks have become available for many European languages, but also for Arabic, Chinese, or Japanese. However, it was shown that parsing results on these treebanks depend on the types of treebank annotations used [ , ]. Another direction ...

متن کامل

Language Independent Dependency to Constituent Tree Conversion

We present a dependency to constituent tree conversion technique that aims to improve constituent parsing accuracies by leveraging dependency treebanks available in a wide variety in many languages. The technique works in two steps. First, a partial constituent tree is derived from a dependency tree with a very simple deterministic algorithm that is both language and dependency type independent...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005